ABSTRACT

This article provides a comprehensive investigation of the relations
between the virality of news articles and the emotions they are found
to evoke. Virality, in our view, is a phenomenon with many facets:
this generic term covers several different effects of persuasive
communication. By exploiting a high-coverage, bilingual corpus of
documents containing metrics of their spread on social networks as
well as a massive affective annotation provided by readers, we
present a thorough analysis of the interplay between evoked emotions
and viral facets.

We highlight and discuss our findings in light of a cross-lingual 
approach: while we discover differences in evoked emotions and 
corresponding viral effects, we provide preliminary evidence of 
a generalized explanatory model rooted in the deep structure of 
emotions: the Valence-Arousal-Dominance (VAD) circumplex. We 
find that viral facets appear to be consistently affected by particular 
VAD configurations, and these configurations indicate a clear
connection with distinct phenomena underlying persuasive
communication.
1. INTRODUCTION

The mass adoption of social networking sites and the wide
integration of sharing widgets on popular websites have paved the
way for quantitative research efforts tackling the relations between
content and virality. Very recently, a different kind of widget has
been working its way through the online space: designed to take the
affective pulse of a website's visitors, such small interfaces allow
people to explicitly tag their emotional states as they browse the
web.

Although the adoption of the latter is not yet prominent, we 
found two very popular news outlets, one in English and one in 
Italian, embedding similar interfaces for affective feedback in each 
article page. We thus became interested in investigating the relation 
of this newly available affective data with virality of news articles 
in a cross-lingual experimental setting. The two online websites 
which made this research possible are: 

1. Rappler (rappler.com), an English-language “social news”
portal, which makes extensive use of its distinctive Mood
Meter feature;

2. the online version of the Italian newspaper Corriere della
Sera (corriere.it), one of the most popular daily newspapers
in Italy.

In our previous work [30], we have shown the impressive value
of such reader-provided affective data: we used Rappler data to
automatically build an emotion lexicon and a system for automatic
affective analysis of texts. In the evaluation we outperformed
existing systems even using a naive approach, thanks to the quality
and coverage of the human annotation of evoked emotions. In this
work, we turn to analyzing such data along with indices of virality
in order to derive insights into the relations between the emotions
evoked by textual content and its diffusion and engagement.

Early work on emotions and virality has already been published
[4], paving the way for this novel line of research. Still, a few
limitations of [4], which this paper attempts to overcome, can be
identified: (i) the use of a small sample of articles (~7,000), with
only a subset (~2,500 documents) manually annotated by three
annotators; (ii) the consideration of a single virality index; and
(iii) only a hint at a possible explanatory role of the deep
constituents of emotions.

In this paper we leverage a large corpus of news articles (ten
times bigger than that of [4]) with a massive crowd-sourced,
voluntary annotation of the emotion each article evokes in readers
(more than 1.5 million votes), in order to:

• compare several viral phenomena at once and understand if
they are consistently affected by emotions and if they are
affected in the same way;

• compare emotion and virality in a cross-lingual experimental 
setting; 

• investigate the effect of deep constituents of emotions in viral 
phenomena. 

Our findings show that, while the relations between emotions
and virality seem to vary across cultures, their deeper constituents
(i.e. the Valence, Arousal, and Dominance components, as defined
in the VAD circumplex model of affect we adopt) show consistency
among all indices of virality we account for, providing a
generalized model for their interplay. More specifically, viral indices
are coherently affected by particular VAD configurations, and these
configurations point to a clear connection with general phenomena
underlying different viral facets. These results are relevant not only
for social science researchers interested in understanding the
factors behind virality phenomena, but also for marketing and
industry practitioners, as they can be very valuable in contexts such
as content marketing and native advertising [13].

This paper is structured as follows: the following section
provides the reader with a brief review of recent research efforts on
virality, social media and emotion studies; in Section 3 we describe 
the data collection procedure followed; Sections 4 and 5 report our 
analyses and findings, while in Section 6 we provide a comparison 
with [4]. Finally, we take stock of our work in Section 7. 

2. RELATED WORK 

In this section we provide a short review of research efforts
focused on (i) understanding content virality on Social Media and (ii)
the role of emotions in Social Media. 

Content Virality. Several researchers have studied information
flow, community building and similar processes using Social
Networking sites as a reference [1, 18, 19, 23]. However, the great
majority focused on network-related features without taking into
account the actual content spreading within the network [22]. A
hybrid approach focusing on both product characteristics and
network-related features is presented in [3]: the authors study the
effect of passive-broadcast and active-personalized notifications
embedded in an application aimed at fostering word of mouth.

Recently, the relation between content characteristics and
virality has begun to be investigated, especially with regard to textual
content. In [17], for instance, features derived from sentiment
analysis of comments are used to predict the popularity of stories. The
relevant work in [7] measures a different form of content spreading,
by analyzing which features of a movie quote make it
“memorable” online. Another approach to content virality,
somewhat complementary to the previous one, is presented in [29], where
the authors investigate which modification dynamics make a meme
spread from one person to another (as compared with movie quotes,
which spread remaining exactly the same). Louis and Nenkova [24]
focused on influential scientific articles in newspapers, considering
characteristics such as readability, description vividness, use of
unusual words and affective content, comparing high quality articles
(NYT articles appearing in “The Best American Science Writing”
anthology) against typical NYT articles.

Moreover, the work presented in [5] investigates how differences
in textual description affect the spread of content-controlled videos.
In [21], the authors focus on the act of resubmission (i.e., content
that is submitted multiple times with multiple titles to multiple
different online communities) to understand the extent to which each
factor influences the success of a piece of content. The work in [32]
investigates how content spreads in an online community by
pinpointing the effect of wording in terms of content informativeness,
generality and affect. Finally, Althoff et al. [2] developed a model
that can predict the success of requests for a free pizza gifted from
the Reddit community, using high level textual features such as
politeness, reciprocity, narrative and gratitude.

More recently, some works have tried to investigate how different
textual contents give rise to different reactions in the audience:
the work presented in [12] correlates several viral phenomena with
the wording of a post, while in [10] it is shown that variations in
specific content features (like the readability level of an abstract)
differentiate among virality levels of downloads, bookmarking, and
citations. Similarly, Shuai et al. [28] studied scientific articles in terms
of downloads, Twitter mentions, and early citations in the scholarly
record, trying to understand how these virality indices correlate
with one another. Finally, also in the realm of visual content it has
been shown that different image characteristics can give rise to
different viral phenomena [11]. Following this line of research, we study
the effects of emotions considering several audience reactions to find
out whether there are peculiar emotional characteristics of a news
article that give rise to different viral reactions.

Emotions in Social Media. Kramer et al. [20] have shown, via
a massive experiment on Facebook, that emotional states can be
transferred to others via emotional contagion, leading people to
experience the same emotions without their awareness. The
experiment included reducing the amount of emotional content in the
News Feed of a user: when positive expressions were reduced,
people produced fewer positive posts and more negative posts; when
negative expressions were reduced, the opposite pattern occurred.
While in [20] the authors manipulated News Feed content, in our
study we simply analyze existing and publicly available content
voluntarily annotated by readers.

In [16], the authors focused on the role of emotions in the
context of word-of-mouth marketing, using a dataset of Google+ posts:
their analyses show consistency with [4], with an increase in ANGER
linked to a higher likelihood of reshares, while the opposite trend
holds for SADNESS. Fan et al. [9] looked at diffusion patterns on
the very popular Chinese platform Weibo, which shares many
features with Twitter, and again found a similar result: angry posts
appear to spread at a significantly faster rate, while sad posts do so
at a significantly lower rate.

Furthermore, the work presented in [14] comes closer to the
focus of the present paper by hypothesizing that negative news
content is more likely to be retweeted, while for non-news tweets
positive sentiments support virality. To test this hypothesis the authors
analyze three corpora and give evidence that negative sentiment
enhances virality in the news segment, but not in the non-news
segment. Their conclusion is that the relation between affect and
virality is more complex than expected based on the findings of [4].

In [26], popular and influential users on Twitter are analyzed
linguistically according to their tweets. The initial hypothesis is that a
Twitter account cannot be simply traced back to the graph
properties of the network within which it is embedded, but also depends
on the personality and emotions of the human being behind it. The
reported findings suggest that popular users tend to use positive
emotions while influential users lean towards negative ones.

Finally, the work presented in [4] uses New York Times articles to 
examine the relationship between emotions evoked by the content 
and virality, using semi-automated sentiment analysis to quantify 
the affectivity and emotionality of each article. Results suggest 
a strong relationship between affect and virality; still, the virality 
metric considered is interesting but very limited: it only consists 
of how many people emailed the article. We consider this work as 
a starting point for our research and for comparison of results: we 
will discuss it throughout the paper. 

Human, crowd-sourced, voluntary, large scale annotations. 
These distinguishing characteristics of the affective data employed 
in our analyses mark the significant difference between this article 
and previous works. While previously mentioned research efforts
have resorted to computational linguistics classifiers to
automatically annotate textual content, the data we crawled (from publicly
available websites) allows us to leverage affective annotations
voluntarily provided by the news article readers.



3. DATASET COLLECTION 

The “social-news” website rappler.com embeds a small
interface, called Mood Meter, in every article it publishes. This
interface allows readers to express their emotional reaction to the
story they are reading with a simple click. The percentages of
votes obtained by each emotion are also visualized by the
interface, which is depicted in Figure 1. Similarly, the online version
of a very popular Italian newspaper (Corriere della Sera) has
recently adopted a comparable approach, based on emoticons, to sense
the emotional states of its readers, as shown in Figure 2¹.


[Figure 1 (image): the Rappler Mood Meter widget, which asks “How did
this story make you feel?” and offers eight labels: Happy, Sad, Angry,
Don't Care, Inspired, Afraid, Amused, Annoyed.]

Figure 1: Rappler’s Mood Meter.


[Figure 2 (image): a row of five emoticons shown under the prompt
“DOPO AVER LETTO QUESTO ARTICOLO MI SENTO...”.]

Figure 2: The emoticon-based interface on corriere.it articles.
The sentence translates to “After reading this article, I feel...”

Following previous work presented in [30], we harvested a total
of 53,226 news articles from rappler.com, and 12,437 articles
from corriere.it. Our crawler was programmed to retrieve
articles at least a month old, in order to allow the indicators to settle
and thus not to penalize the most recent articles. The articles span
roughly one year for Rappler, and nine months for Corriere
della Sera. For the scope of this paper, the main difference
between the two mechanisms lies in the fact that, while it is possible
to extract absolute votes from corriere.it, along with percentage
values, rappler.com only exposes the latter. In particular,
for the bulk of articles crawled from corriere.it, our data
includes a total of 320,697 votes for the five emotional dimensions
available.

Although we have no means to verify the actual absolute number
of votes collected by the Mood Meter, we can provide a very
conservative estimate for it: by computing the lowest common
denominator over the percentages of affective votes obtained by a Rappler
article, we can derive the minimum number of votes needed to
obtain them, amounting to a total of 1,145,543 votes over the entire
Rappler dataset. For comparison, consider that the same
conservative estimate for the corriere.it data amounts to 210,113,
less than two thirds of the actual value reported above.

Thus, the datasets used in this work comprise more than 65,000
news articles and more than 1.5 million annotations.

Since the sets of emotions accounted for in the two websites
differ both in size (rappler.com allows readers to tag eight affective
dimensions, whereas five are available on corriere.it) and,
albeit slightly, in semantics (being in two different languages), we
need to map the subset available from the latter to the
former, as reported in Table 1.

corriere.it label    maps to
TRISTE               SAD
DIVERTITO            AMUSED
SODDISFATTO          HAPPY*
PREOCCUPATO          AFRAID*
INDIGNATO            ANNOYED*

Table 1: Mapping of original emotion labels from
corriere.it to those present in rappler.com. * denotes an
appropriate mapping, although the English label is not the
primary translation.

¹ As we are forced to adopt a crawling approach, we cannot have
any control over the layout of the widgets shown above. Thus, we
cannot rule out the possibility of biases arising from the (fixed)
order used to present readers with affective labels and emoticons.
Still, the results reported in [30], obtained thanks to data crawled
with the same strategy, indicate that such bias, if present, is
negligible.
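The conservative vote estimate described in this section (the minimum number of votes compatible with the percentages published for an article) can be sketched as follows. This is only an illustration: the function name `min_votes` is hypothetical, and the sketch assumes that the published percentages are exact integers, ignoring any rounding applied by the widget.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def min_votes(percentages):
    """Smallest vote count that can yield the given integer percentages.

    Each percentage p is read as the fraction p/100 in lowest terms; the
    lowest common denominator of these fractions is the minimum number of
    votes for which every share corresponds to a whole number of votes.
    """
    denominators = [Fraction(p, 100).denominator for p in percentages if p > 0]
    if not denominators:
        return 0  # no votes recorded for this article
    return reduce(lcm, denominators, 1)


# Hypothetical article: 50% Happy, 25% Sad, 25% Angry.
estimate = min_votes([50, 25, 25])
```

Summing this per-article estimate over a corpus gives a lower bound on the total number of votes, which is why the text calls the figure very conservative.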


In Table 2 we report the mean percentage of votes for each
emotional dimension on the two corpora: it can be seen that HAPPINESS
has the highest percentage of votes by a large margin in comparison
to the other dimensions in rappler.com, while it is very closely
followed by INDIGNATO/ANNOYED in the corriere.it data.


Emotion      Rappler (μ)   Corriere (μ)
Afraid       .05           .09
Dont_Care    .05           -
Amused       .11           .11
Happy        .31           .29
Angry        .11           -
Inspired     .11           -
Annoyed      .06           .25
Sad          .12           .10

Table 2: Mean vote percentages obtained by emotions in Rappler
and Corriere.

The statistics on rappler.com emotional data confirm the
trend noted in [30] about HAPPINESS predominance, for which
several explanations may be hypothesized: from cultural
characteristics, to a bias in the dataset itself (as it might contain mainly
‘positive’ news), through psychological phenomena leading people
to express more positive moods on social networks [8, 26, 33].

Studies on other English datasets, e.g. on LiveJournal posts [31],
have in the past noted predominance of the happy mood.
Conversely, no previous studies have dealt at this scale with the Italian
language, and it will be worth investigating in future works what
factors may influence the trend shown for emotional dimensions in
the corriere.it data.

Viral index        Emotion     Estimate  Std. Err  Sigf.
Rappler Comments   ANNOYED       .61       .03     ***
                   ANGRY         .54       .02     ***
                   INSPIRED      .23       .02     ***
                   DONT_CARE     .18       .04     ***
                   AMUSED        .13       .02
                   HAPPY         .10       .02
                   SAD           .07       .02     ***
                   AFRAID       -.03       .03     †
Rappler Tweets     INSPIRED      .52       .02     ***
                   ANGRY         .32       .02     ***
                   ANNOYED       .32       .03     ***
                   HAPPY         .31       .02     ***
                   DONT_CARE     .30       .04     ***
                   SAD           .24       .02     ***
                   AMUSED        .21       .02     ***
                   AFRAID        .19       .03     ***
Rappler G+         INSPIRED      .55       .02     ***
                   ANGRY         .25       .02     ***
                   HAPPY         .24       .02     ***
                   AMUSED        .21       .02     ***
                   ANNOYED       .20       .03     ***
                   SAD           .16       .02     ***
                   AFRAID        .14       .03     ***
                   DONT_CARE     .13       .04     ***

Table 3: Impact of emotions on the viral indices of rappler.com
articles. ***: p < .001; **: p < .01; *: p < .05; †: not significant (this
legend applies to all tables in this paper).

Viral index        Emotion     Estimate  Std. Err  Sigf.
Corriere Comments  ANNOYED      1.11       .04     ***
                   HAPPY         .40       .04     ***
                   AFRAID        .19       .06     **
                   AMUSED        .13       .06     *
                   SAD           .12       .05     *
Corriere Tweets    ANNOYED       .33       .03     ***
                   SAD           .20       .05     ***
                   HAPPY         .14       .03     ***
                   AFRAID        .06       .05     †
                   AMUSED        .02       .05     †
Corriere G+        SAD           .57       .05     ***
                   ANNOYED       .39       .03     ***
                   AMUSED        .34       .05     ***
                   AFRAID        .33       .05     ***
                   HAPPY         .27       .03     ***

Table 4: Impact of emotions on the viral indices of corriere.it
articles.

Turning to the statistics of the various viral indices available for 
the two datasets, we provide a summary in Table 5. 


Viral Index        Rappler (μ)   Corriere (μ)
Comments           4.11          83.22
Threads            2.81          -
G+ shares          .91           3.79
Twitter shares     32.33         40.65
Facebook shares    -             502.93

Table 5: Mean figures for virality indices.

From now on, we will only consider the intersection of the two 
sets of indices. 

3.1 Narrow- and Broad-casting

In the literature, broadcasting refers to the act of communicating
or transmitting content to numerous recipients simultaneously,
over a communication network; by contrast, narrowcasting has
traditionally been understood as the dissemination of information
to a narrow audience, rather than to the broader public at large.
With the advent of new media, this definition has been updated (see,
for instance, [15]). In fact, readers of online content can decide
whether and how to share such content, either by broadcasting it to
their whole audience or by addressing only a small portion of it
(narrowcasting). In [4], for example, the act of forwarding a news
article by email is treated as a form of narrowcasting, since in such
a case subjects are addressing a selected section of their
audience/contacts.

Following the above definition, we will consider article
comments as a form of narrowcasting, as the readers who post a
comment on the article page are contributing to a discussion which
happens at most among the readers of that article (and most
probably, in fact, among a small subset of them). On the other hand,
we consider the act of sharing an article on social networking sites
as a form of broadcasting.

The rationale behind this distinction of viral facets into
narrowcasting and broadcasting lies in previous research efforts showing
that virality can be assessed through different metrics which
represent distinct phenomena: it is not only the magnitude of spreading
(virality) that depends on content but also the users' reactions
(comments, shares, tweets, etc.) [10, 11, 28].

In the following sections, all virality indices have been
standardized (μ = 0, σ = 1) in order to make them comparable across the
two datasets. Emotion scores range between 0 and 1.
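The standardization step can be illustrated with a minimal sketch. The function name `standardize` and the toy values are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale a list of raw index values to zero mean and unit deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    return [(v - mu) / sigma for v in values]


# Hypothetical raw comment counts for five articles.
raw_comments = [1, 2, 3, 4, 5]
z_comments = standardize(raw_comments)
```

After this transformation an index from one dataset can be compared with the same index from the other, since both are expressed in units of their own standard deviation.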

4. EMOTION ANALYSIS 

To analyze the relationship between an article's emotional
characteristics and the corresponding impact on virality indices we use
simple linear models. In this section we will discuss the importance
of each emotion for the various virality indices, while the overall
explanatory power of the models will be discussed in Section 6.
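As an illustration of how an estimate and its standard error can be obtained, the sketch below fits an ordinary least-squares line with a single predictor. It assumes one univariate regression per emotion, which is only one possible reading of "simple linear models"; the function name `fit_simple_model` and the toy data are hypothetical and do not come from our datasets.

```python
import math


def fit_simple_model(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, std_err), where std_err is the standard
    error of the slope computed with n - 2 degrees of freedom.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual sum of squares around the fitted line.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return slope, intercept, std_err


# Hypothetical toy data: share of ANGRY votes (0 to 1) and a
# standardized comment count for five articles.
angry_share = [0.0, 0.1, 0.2, 0.4, 0.8]
comments_z = [-0.9, -0.6, -0.2, 0.3, 1.4]
slope, intercept, std_err = fit_simple_model(angry_share, comments_z)
```

The ratio of the slope to its standard error is what the significance markers in the tables summarize.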

Results on the rappler.com dataset, shown in Table 3, are in
accordance with the findings of Berger et al. [4]: high influence of
INSPIRED (awe) and of negative-valence and high-arousal
emotions such as ANGER, jointly with a low influence of SADNESS.
This similarity stands out for the specific case of narrowcasting,
and it is maintained for broadcasting, albeit to a lesser extent.

As mentioned in Section 2, other previous research works [9, 
16], focusing on diverse scenarios and using heterogeneous datasets, 
have reported results consistent with these findings. 

Nonetheless, turning to the Italian resource used in our
analyses, we notice that results obtained from the corriere.it data,
summarized in Table 4, are partially in line with [4] as far as
narrowcasting is concerned, while they diverge drastically on
broadcasting: in the latter case, SADNESS (a low-arousal emotional
dimension) is found to be most relevant for virality, contradicting the
hypothesis that arousal alone can be used to explain virality phenomena.
It should be noted that the lack of an INSPIRED dimension for
corriere.it cannot account for such a discrepancy on
broadcasting, since it seems very unlikely that its potential votes would
have been collected by SADNESS.

It can be hypothesized, as a consequence, that strong cultural
differences emerge when it comes to emotions (more precisely, to
explicitly tagging one’s own feelings on a website). What factors
might underlie this phenomenon (e.g. historical period, editorial
choices, deeper cultural sensibilities, etc.) is a very
fascinating research question, which is beyond the scope of this work.

Rather than focusing on specific emotions, in the next section we 
attempt to provide a more general explanation. 



Model                    Dimension   Estimate  Std. Err  Sigf
corriere.it comments     AROUSAL       .3741    .0372
                         VALENCE      -.3155    .0486
                         DOMINANCE     .1000    .0761    †
corriere.it tweets       DOMINANCE     .2158    .0709    **
                         VALENCE      -.2176    .0453
                         AROUSAL       .0400    .0348    †
corriere.it G+ shares    DOMINANCE     .4817    .0704
                         VALENCE      -.3781    .0450
                         AROUSAL      -.0242    .0345    †
rappler.com comments     AROUSAL       .1205    .0101
                         VALENCE      -.1636    .0165
                         DOMINANCE     .0728    .0221    **
rappler.com tweets       DOMINANCE     .2244    .0221
                         VALENCE      -.1495    .0165
                         AROUSAL       .0065    .0101    †
rappler.com G+ shares    DOMINANCE     .2619    .0221
                         VALENCE      -.1530    .0165
                         AROUSAL      -.0371    .0101

Table 7: Valence, Arousal and Dominance significant effects from
simple Linear Models.


EMOTION      Valence  Arousal  Dominance
AFRAID       2.25     5.12     2.71
AMUSED       7.05     4.27     5.93
ANGRY        2.53     6.02     4.11
ANNOYED      2.80     5.29     4.08
DONT_CARE    3.53     4.27     3.62
HAPPY        8.47     6.05     7.21
INSPIRED     6.89     5.56     7.30
SAD          2.10     3.49     3.84

Table 6: Valence, Arousal and Dominance scores for emotion
labels, as provided by [34].


5. VAD ANALYSIS 

We now turn to investigate how basic constituents of emotions,
such as Valence, Arousal, and Dominance (VAD), connect to
virality. The widely adopted VAD circumplex model of affect [6,
27] maps emotions onto a three-dimensional space, namely: Valence,
denoting the degree of positive/negative affectivity (e.g. FEAR has
high negative valence, while JOY has high positive valence); Arousal,
ranging from calming to exciting (e.g. ANGER is denoted by high
arousal while SADNESS by low arousal); and Dominance, going
from “controlled” to “in control” (e.g. INSPIRED, highly in
control, vs FEAR, overwhelming).

We thus proceed to map the emotions an article is found to evoke
to the VAD circumplex model. To do so, we exploit the work of
Warriner et al. [34], who provided a resource mapping roughly 14
thousand words to the VAD circumplex model, including words
representing the emotional dimensions we consider in this work
(see Table 6); for us these scores serve as a gold standard. They are
on a Likert scale, ranging from 1 (low/negative) to 9 (high/positive).

Then, in order to compute VAD scores for a given article doc we 
simply multiply the percentage of votes each emotion e is found to 
evoke in doc by the corresponding VAD score provided in Table 6, 
and then take the sum over the n emotional dimensions considered. 
The formula for Valence is provided below; the equations used for 
Arousal and Dominance are analogous.


\[
\mathit{doc}_V = \sum_{i=1}^{n} \mathit{votes\%}(e_i) \times \mathit{valence}(e_i) \qquad (1)
\]
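Equation (1) and its Arousal and Dominance counterparts can be sketched in a few lines. The scores are those reported in Table 6; the names `VAD_SCORES` and `article_vad`, as well as the example article, are hypothetical and serve only as an illustration.

```python
# (valence, arousal, dominance) per emotion label, copied from Table 6.
VAD_SCORES = {
    "AFRAID": (2.25, 5.12, 2.71),
    "AMUSED": (7.05, 4.27, 5.93),
    "ANGRY": (2.53, 6.02, 4.11),
    "ANNOYED": (2.80, 5.29, 4.08),
    "DONT_CARE": (3.53, 4.27, 3.62),
    "HAPPY": (8.47, 6.05, 7.21),
    "INSPIRED": (6.89, 5.56, 7.30),
    "SAD": (2.10, 3.49, 3.84),
}


def article_vad(vote_shares):
    """Return (valence, arousal, dominance) as vote-weighted sums.

    vote_shares maps an emotion label to the fraction of votes (0 to 1)
    that the emotion received for the article.
    """
    totals = [0.0, 0.0, 0.0]
    for emotion, share in vote_shares.items():
        for k, score in enumerate(VAD_SCORES[emotion]):
            totals[k] += share * score
    return tuple(totals)


# Hypothetical article: 50% HAPPY, 30% INSPIRED, 20% SAD.
valence, arousal, dominance = article_vad(
    {"HAPPY": 0.5, "INSPIRED": 0.3, "SAD": 0.2}
)
```

For corriere.it articles the same function applies once the five Italian labels have been mapped to their English counterparts (Table 1).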


As with virality indices, each VAD dimension has been
standardized. Subsequently, we build linear models in order to
assess whether and how VAD dimensions are connected to
virality. The results reported in Table 7 show very interesting trends:
VAD dimension estimates are found to be consistent between the
two datasets (see estimate order and sign in Table 7), in both
broadcasting (tweets/G+ shares) and narrowcasting (comments)
scenarios. Comparing these results with those provided in the previous
sections, we see that the cross-cultural divergences in the relations
between emotional dimensions and virality indices disappear when
accounting for their more profound VAD components.

This finding hints at a generalized, culturally independent
phenomenon: readers of the articles in our datasets tend to choose
narrowcasting forms of communication when the content is
arousing but leaves them feeling less in control².

Conversely, they turn to broadcasting (e.g. sharing on social
networks) when they feel more in control. This result is further
supported by the fact that the less important dimensions (i.e.
Dominance for narrowcasting, and Arousal for broadcasting) are most
of the time not significant, or have a slight negative impact on the
model, indicating that these two dimensions switch roles when
transitioning from narrowcasting to broadcasting (and vice versa). This
finding is important as it appears to be valid, in our analyses, across
the two different languages (and cultural factors) our two datasets
provide.

Finally, the role of Valence appears to be consistent among all
indices of virality in both datasets, with negative valence
contributing to higher virality. This result is in line with [14], who found
that news-related content spreads more when imbued with negative
Valence.


6. R² ANALYSIS

In this section we examine the explanatory power of our
models and compare them with the results obtained by the models
presented in Berger et al. [4], in particular with model 2 (Positivity
and Emotionality) and model 3 (Positivity and Emotionality plus
the emotions Awe, Anger, Anxiety, Sadness).


² Again, this result is in line with the intuition in [4] that arousal
has a strong effect on narrowcasting. Nonetheless, it appears of
paramount importance to take the role of Dominance (not
considered in [4]) into account in order to understand broadcasting
phenomena.



We compare our VAD models with model 2, which was
automatically computed starting from a lexicon of relevant affective words
(the LIWC lexicon described in [25]), and our emotion-based
models with model 3 in [4]. It should be noted that the extensive work
presented in [4] includes other models accounting for variables
(such as publication time, homepage position, author, among
others) not available in our datasets. Nonetheless, it has been shown
in [4] that, while augmenting the explanatory power, these variables
do not influence the weight and role of the emotions in the models.

In [4], the dependent variable is represented as binary (i.e., 1 for
those articles that make it onto the most emailed list, 0 for all the
others). By contrast, our viral indices are originally represented as
continuous variables. In order to provide a fair comparison we thus
proceeded to binarize the virality index V_n of each article n,
according to the following scheme:

\[
V_n^{\mathrm{bin}} =
\begin{cases}
1, & \text{if } V_n > \mathrm{mean}(V) + \mathrm{sd}(V) \\
0, & \text{otherwise}
\end{cases}
\]
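The binarization scheme can be sketched as follows. The function name `binarize` and the toy values are hypothetical, and the population standard deviation is an assumption, as the text does not specify the estimator.

```python
from statistics import mean, pstdev


def binarize(index_values):
    """Label an article 1 if its index exceeds mean + one standard deviation."""
    # Assumption: population standard deviation of the index over the corpus.
    threshold = mean(index_values) + pstdev(index_values)
    return [1 if v > threshold else 0 for v in index_values]


# Hypothetical share counts: only the last article is far above the mean.
labels = binarize([1, 1, 1, 1, 10])
```

Because the threshold sits one standard deviation above the mean, only a small fraction of articles receives the positive label.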

In this way, we obtain a small sample of the most viral articles
in our dataset, analogous to the most emailed list in [4]. We also
used logistic regression in accordance with [4]. For comparison,
we report in Table 8 the McFadden’s R² used by Berger et al. [4],
along with the standard R².
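For reference, McFadden's pseudo-R² compares the log-likelihood of a fitted logistic model with that of a null model that predicts the base rate for every article: R² = 1 - lnL(model) / lnL(null). A minimal sketch follows; the function names and the toy labels and probabilities are hypothetical, and the fitted probabilities are assumed to come from a logistic regression estimated elsewhere.

```python
import math


def log_likelihood(labels, probabilities):
    """Bernoulli log-likelihood of binary labels under predicted probabilities."""
    return sum(
        math.log(p) if y == 1 else math.log(1.0 - p)
        for y, p in zip(labels, probabilities)
    )


def mcfadden_r2(labels, probabilities):
    """McFadden's pseudo-R2: 1 - lnL(model) / lnL(null model)."""
    base_rate = sum(labels) / len(labels)
    ll_model = log_likelihood(labels, probabilities)
    # The null model assigns the overall base rate to every observation.
    ll_null = log_likelihood(labels, [base_rate] * len(labels))
    return 1.0 - ll_model / ll_null


# Hypothetical example: one viral article out of four, with fitted
# probabilities from some logistic model.
r2 = mcfadden_r2([1, 0, 0, 0], [0.9, 0.1, 0.1, 0.1])
```

Values of this measure are typically much lower than those of the ordinary R², so small figures do not by themselves indicate a poor fit.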


Viral Index                    R²       McFadden’s R²
corriere.it emotion models
  Comments                     .0813    .0922
  G+                           .0207    .0527
  Tweets                       .0084    .0604
corriere.it VAD models
  Comments                     .0530    .0840
  G+                           .0195    .0446
  Tweets                       .0072    .0591
rappler.com emotion models
  Comments                     .0191    .0801
  G+                           .0115    .0314
  Tweets                       .0112    .0300
rappler.com VAD models
  Comments                     .0101    .0569
  G+                           .0088    .0288
  Tweets                       .0100    .0266
Berger et al. models
  Model 2                      -        .0400
  Model 3                      -        .0700

Table 8: R² scores for the various linear models described and
comparison with models presented in Berger et al. [4]

Hence, when projecting emotions to their basic VAD
dimensions, the models experience only a slight drop in terms of
explanatory power, while gaining the advantage of language-invariance.

Moreover, our regression models for narrowcasting have greater 
explanatory power than the best performing model in [4]. This fact 
can be partially explained by the quality of our data: our datasets 
are much bigger and the affective annotations of the news articles 
were massively crowdsourced to the readers. 

Finally, emotions have significant effects on both narrowcasting
and broadcasting, with a stronger impact on the former. Again, this
finding is consistent across cultures.


7. CONCLUSIONS 

In this article we have provided a comprehensive investigation
of the relations between the virality of news articles and the emotions
they are found to evoke. By exploiting a high-coverage, bilingual
corpus of 65k documents containing metrics of their spread on
social networks as well as a massive affective annotation provided
by readers (more than 1.5 million votes), we presented a thorough
analysis of the interplay between evoked emotions and viral facets.

We highlighted and discussed our findings in light of a
cross-lingual approach.

Our results show differences in evoked emotions and
corresponding viral effects across the bilingual data we crawled; amongst
other findings, we note the remarkably discordant influence of the
SADNESS affective dimension between the English- and
Italian-language datasets, in contrast with the hypothesis that arousal alone
can be used to explain virality phenomena. While these findings
seem to indicate effects of cultural differences on the relation
existing between emotions and virality, such a hypothesis could only
be properly tested were more information on the annotators'
demographics available (e.g. geographic provenance).

Still, when accounting for the deeper constituents of emotions
our analyses provided compelling evidence of a generalized
explanatory model rooted in the Valence-Arousal-Dominance (VAD)
circumplex. Viral facets seem to be coherently affected by
particular VAD configurations (namely, the alternation between
Dominance and Arousal), and these configurations indicate a clear
connection with distinct phenomena underlying persuasive
communication. In particular, high Arousal is more connected to
narrowcasting phenomena, while Dominance is more connected to
broadcasting phenomena.

Extensions of this work will include deeper investigations of the 
relations between VAD dimensions and virality. For instance, while 
in the analyses reported herein we have used a resource [34] to map 
affective tags to their VAD constituents, we plan to increase the 
resolution by associating each article with a VAD representation 
derived from its textual content. 

The results presented in this article can be very valuable in
contexts such as content marketing and native advertising, and thus be
relevant not only for social science researchers interested in
understanding the factors behind virality phenomena, but also for
marketing and industry practitioners.